Google Images


Shape-Based Single Object Classification Using Ensemble Method Classifiers

Kamarudin, Nur Shazwani, Makhtar, Mokhairi, Shamsuddin, Syadiah Nor Wan, Fadzli, Syed Abdullah

arXiv.org Artificial Intelligence

Nowadays, more and more images are available, and annotating and retrieving them poses classification problems, where each class is defined as the group of database images labelled with a common semantic label. Various systems have been proposed for content-based retrieval, as well as for image classification and indexing. In this paper, a hierarchical classification framework is proposed for bridging the semantic gap effectively and achieving multi-category image classification. Well-known pre-processing and post-processing methods were applied to three problems: image segmentation, object identification and image classification. The method was applied to classify single-object images from Amazon and Google datasets. Classification was tested with four different classifiers: Bayes Network (BN), Random Forest (RF), Bagging and Vote. Estimated classification accuracies ranged from 20% to 99% (using 10-fold cross-validation). The Bagging classifier gave the best performance, followed by the Random Forest classifier.
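The Vote classifier mentioned in this abstract combines the predictions of several base classifiers by majority rule. A minimal pure-Python sketch of that idea (the function and label names are illustrative, not from the paper):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier predictions by majority rule.

    predictions: list of label sequences, one per base classifier,
    all of the same length (one label per test sample).
    """
    combined = []
    for sample_labels in zip(*predictions):
        # Pick the most common label among the base classifiers.
        label, _count = Counter(sample_labels).most_common(1)[0]
        combined.append(label)
    return combined

# Three hypothetical base classifiers voting on four samples:
votes = [
    ["cat", "dog", "cat", "dog"],
    ["cat", "cat", "cat", "dog"],
    ["dog", "dog", "cat", "cat"],
]
print(majority_vote(votes))  # ['cat', 'dog', 'cat', 'dog']
```

Bagging works similarly but trains each base classifier on a bootstrap resample of the training data before voting.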


GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets

Kwon, Oh Joon, Matsunaga, Daiki E., Kim, Kee-Eung

arXiv.org Artificial Intelligence

A critical component of the current generation of language models is preference alignment, which aims to precisely control the model's behavior to meet human needs and values. The most notable among such methods are Reinforcement Learning from Human Feedback (RLHF) and its offline variant, Direct Preference Optimization (DPO), both of which seek to maximize a reward model based on human preferences. In particular, DPO derives reward signals directly from offline preference data, but in doing so can overfit the reward signals and generate suboptimal responses that reproduce human biases in the dataset. In this work, we propose a practical application of a diversity-seeking RL algorithm called GFlowNet-DPO (GDPO) in an offline preference alignment setting to mitigate these challenges. Empirical results show that GDPO can generate far more diverse responses than the baseline methods while remaining relatively well aligned with human values in dialog generation and summarization tasks.
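As a rough sketch of the DPO objective referenced above (my own simplification, not the authors' code): the loss rewards a larger log-probability margin for the preferred response over the dispreferred one, measured relative to a frozen reference model.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Per-example DPO loss.

    logp_w / logp_l: policy log-probs of the preferred / dispreferred response.
    ref_logp_w / ref_logp_l: the same quantities under the frozen reference model.
    beta: temperature controlling deviation from the reference model.
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # -log sigmoid(beta * margin): small when the preferred response
    # gains probability relative to the reference model.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With no margin the loss is log(2); a positive margin shrinks it.
print(round(dpo_loss(0.0, 0.0, 0.0, 0.0), 4))  # 0.6931
```

Because the loss depends only on these fixed log-ratios from an offline dataset, the policy can overfit them, which is the failure mode GDPO targets with diversity-seeking sampling.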


Can LLMs Generate Visualizations with Dataless Prompts?

Coelho, Darius, Barot, Harshit, Rathod, Naitik, Mueller, Klaus

arXiv.org Artificial Intelligence

Recent advancements in large language models have revolutionized information access, as these models harness data available on the web to address complex queries, becoming the preferred information source for many users. In certain cases, queries are about publicly available data, which can be effectively answered with data visualizations. In this paper, we investigate the ability of large language models to provide accurate data and relevant visualizations in response to such queries. Specifically, we examine whether GPT-3 and GPT-4 can generate visualizations from dataless prompts, where no data accompanies the query. We evaluate the models' results by comparing them to visualization cheat sheets created by visualization experts.


Smiling Women Pitching Down: Auditing Representational and Presentational Gender Biases in Image Generative AI

Sun, Luhang, Wei, Mian, Sun, Yibing, Suh, Yoo Ji, Shen, Liwei, Yang, Sijia

arXiv.org Artificial Intelligence

Generative AI models like DALL-E 2 can interpret textual prompts and generate high-quality images exhibiting human creativity. Though public enthusiasm is booming, systematic auditing of potential gender biases in AI-generated images remains scarce. We addressed this gap by examining the prevalence of two occupational gender biases (representational and presentational biases) in 15,300 DALL-E 2 images spanning 153 occupations, and assessed potential bias amplification by benchmarking against 2021 census labor statistics and Google Images. Our findings reveal that DALL-E 2 underrepresents women in male-dominated fields while overrepresenting them in female-dominated occupations. Additionally, DALL-E 2 images tend to depict more women than men with smiling faces and downward-pitching heads, particularly in female-dominated (vs. male-dominated) occupations. Our computational algorithm auditing study demonstrates more pronounced representational and presentational biases in DALL-E 2 compared to Google Images and calls for feminist interventions to prevent such bias-laden AI-generated images from feeding back into the media ecology.


Scrape and Download Google Images with Python

#artificialintelligence

As web scraping has become more prevalent, its range of uses has grown considerably, with a regular, uninterrupted flow of data from target websites into the data sets of artificial intelligence applications. Image processing, one of the most popular areas in artificial intelligence, is a field of computer science focused on enabling computers to identify and understand objects and people in images and videos. Like other types of artificial intelligence, it aims to perform and automate tasks that replicate human capabilities.
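A minimal sketch of the two core steps, parsing image URLs out of a results page and downloading them, using only the Python standard library. The parser class and function names are illustrative, and a real Google Images results page is JavaScript-heavy, so production scrapers typically rely on a dedicated package or browser automation instead:

```python
from html.parser import HTMLParser
from urllib.request import urlretrieve  # used for the actual download step

class ImageSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)

def extract_image_urls(html):
    parser = ImageSrcParser()
    parser.feed(html)
    return parser.urls

sample = '<div><img src="https://example.com/a.jpg"><img alt="no src"></div>'
print(extract_image_urls(sample))  # ['https://example.com/a.jpg']

# Downloading would then be, for each collected url (needs network access):
#   urlretrieve(url, "images/%03d.jpg" % index)
```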


Open Source Projects for Machine Learning Enthusiasts

#artificialintelligence

Open source refers to work that people can modify and share because it is accessible to everyone. You can use the work in new ways, integrate it into a larger project, or derive a new work from the original. Open source promotes the free exchange of ideas within a community to build creative and technological innovations, and working in the open encourages you to write cleaner code, whatever your project of choice.


Google's AI looks beneath the surface for information about people, places, and things in images

#artificialintelligence

Google today announced it will begin showing quick facts related to photos in Google Images, enabled by AI. Starting this week in the U.S., users who search for images on mobile might see information from Google's Knowledge Graph -- Google's database of billions of facts -- including people, places, or things germane to specific pictures. Google says the new feature, which will start to appear on some photos within Google Images before expanding to more languages and surfaces over time, is intended to provide context around both images and the webpages hosting them. It's estimated that images currently make up 12.4% of search queries on Google, and at least a portion of these are irrelevant or manipulated. In an effort to address this, Google earlier this year began identifying misleading photos in Google Images with a fact-check label, expanding the function beyond its standard non-image searches and video.


Measuring Social Biases in Grounded Vision and Language Embeddings

Ross, Candace, Katz, Boris, Barbu, Andrei

arXiv.org Artificial Intelligence

We generalize the notion of social biases from language embeddings to grounded vision and language embeddings. Biases are present in grounded embeddings, and indeed seem to be equally or more significant than for ungrounded embeddings. This is despite the fact that vision and language can suffer from different biases, which one might hope could attenuate the biases in both. Multiple ways exist to generalize metrics measuring bias in word embeddings to this new setting. We introduce the space of generalizations (Grounded-WEAT and Grounded-SEAT) and demonstrate that three generalizations answer different yet important questions about how biases, language, and vision interact. These metrics are used on a new dataset, the first for grounded bias, created by extending standard linguistic bias benchmarks with 10,228 images from COCO, Conceptual Captions, and Google Images. Dataset construction is challenging because vision datasets are themselves very biased. The presence of these biases in deployed systems will begin to have real-world consequences, making carefully measuring bias and then mitigating it critical to building a fair society.
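The WEAT-style metrics being generalized here score how strongly a target embedding associates with one attribute set versus another. A minimal sketch of that association score over toy vectors (illustrative only, not the paper's Grounded-WEAT implementation):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(w, attrs_a, attrs_b):
    """WEAT-style association: mean cosine similarity of embedding w
    with attribute set A minus its mean similarity with set B."""
    mean_a = sum(cosine(w, a) for a in attrs_a) / len(attrs_a)
    mean_b = sum(cosine(w, b) for b in attrs_b) / len(attrs_b)
    return mean_a - mean_b

# A toy target vector closer to attribute set A than to set B:
w = [1.0, 0.1]
A = [[1.0, 0.0], [0.9, 0.1]]
B = [[0.0, 1.0], [0.1, 0.9]]
print(association(w, A, B) > 0)  # True
```

The grounded variants replace or augment the word embeddings with image-conditioned embeddings, which is what raises the design questions the paper enumerates.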


Training an emotion detector with transfer learning

#artificialintelligence

The first thing to do in any machine learning task is to collect the data. What we need are thousands of images with labeled facial expressions. The public FER dataset [1] is a great starting point with 28,709 labeled images. However, since the resolution of these images is only 48 x 48 pixels, it would be nice to also have a dataset with richer features. To do this, we will use the google_images_download Python package to query and scrape data from Google Images.
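The scraping step can be sketched roughly as follows. The query dictionary uses google_images_download's argument names as I recall them (keywords, limit, format, output_directory), so treat them as assumptions to verify against the package's documentation; the actual download also requires network access:

```python
def build_query(expression, limit=100, out_dir="fer_extra"):
    """Assemble the arguments dict for one facial-expression query.
    Keys mirror google_images_download's documented arguments (assumed)."""
    return {
        "keywords": expression + " face",
        "limit": limit,
        "format": "jpg",
        "output_directory": out_dir,
    }

queries = [build_query(e) for e in ["happy", "sad", "angry"]]
print(queries[0]["keywords"])  # happy face

# The download itself (requires the package and network access) would be:
#   from google_images_download import google_images_download
#   downloader = google_images_download.googleimagesdownload()
#   for q in queries:
#       downloader.download(q)
```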


Prepare your own data set for image classification in Python ML

#artificialintelligence

There are many open-source data sets available on the Internet for machine learning, but when managing your own project you may need your own data set. Today, let's discuss how we can prepare our own data set for image classification. The first and foremost task is to collect the data (images). There are many browser plugins for downloading images in bulk from Google Images. Suppose you want to classify cars versus bikes.
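A common layout is one folder per class, optionally split into train and test subsets. A minimal sketch of that split using only the standard library (the cars/bikes folder names are just this article's running example):

```python
import os
import random
import shutil
import tempfile

def split_dataset(src_dir, dst_dir, classes, test_fraction=0.2, seed=0):
    """Copy images from src_dir/<class>/ into dst_dir/{train,test}/<class>/."""
    rng = random.Random(seed)
    for cls in classes:
        files = sorted(os.listdir(os.path.join(src_dir, cls)))
        rng.shuffle(files)
        n_test = int(len(files) * test_fraction)
        for split, names in [("test", files[:n_test]), ("train", files[n_test:])]:
            out = os.path.join(dst_dir, split, cls)
            os.makedirs(out, exist_ok=True)
            for name in names:
                shutil.copy(os.path.join(src_dir, cls, name), out)

# Example with placeholder files standing in for downloaded images:
root = tempfile.mkdtemp()
for cls in ["cars", "bikes"]:
    os.makedirs(os.path.join(root, "raw", cls))
    for i in range(10):
        open(os.path.join(root, "raw", cls, "img%02d.jpg" % i), "w").close()

split_dataset(os.path.join(root, "raw"), os.path.join(root, "data"), ["cars", "bikes"])
print(len(os.listdir(os.path.join(root, "data", "train", "cars"))))  # 8
```

With this layout, most image-classification libraries can infer the class label directly from the directory name.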